Estimating Policy Functions in Payments Systems Using Reinforcement Learning

Abstract

We show that reinforcement learning techniques can be used to estimate the optimal reaction functions of banks participating in high-value payments systems, a real-world strategic game characterized by incomplete information.

Related resources

Reinforcement Using Supervised Learning for Policy Generalization

Applying reinforcement learning to large Markov Decision Processes (MDPs) is an important issue for solving very large problems. Since the exact resolution is often intractable, many approaches have been proposed to approximate the value function (for example, TD-Gammon (Tesauro 1995)) or to approximate the policy directly by gradient methods (Russell & Norvig 2002). Such approaches provide a poli...

Payments systems and monetary policy

A dynamic spatial model is constructed where there is a role for money and for centralized payments arrangements, and where there are aggregate fluctuations driven by fluctuations in aggregate productivity. With decentralized monetary exchange and no centralized payments arrangements, there is price level indeterminacy, and the equilibrium allocation is inefficient. A private clearinghouse arra...

Transfer of task representation in reinforcement learning using policy-based proto-value functions

Reinforcement Learning research is traditionally devoted to solving single-task problems. This means that, any time a new task is faced, learning must be restarted from scratch. Recently, several studies have addressed the issue of reusing the knowledge acquired in solving previous related tasks by transferring information about policies and value functions. In this paper we analyze the use of pr...

Reinforcement Learning Based PID Control of Wind Energy Conversion Systems

In this paper an adaptive PID controller for Wind Energy Conversion Systems (WECS) has been developed. The adaptation technique applied to this controller is based on Reinforcement Learning (RL) theory. Nonlinear characteristics of wind variations as plant input, wind turbine structure and generator operational behavior demand a high-quality adaptive controller to ensure both robust stability an...

On-policy concurrent reinforcement learning

When an agent learns in a multiagent environment, the payoff it receives is dependent on the behavior of the other agents. If the other agents are also learning, its reward distribution becomes non-stationary. This makes learning in multiagent systems more difficult than single-agent learning. Prior attempts at value-function based learning in such domains have used off-policy Q-learning that do ...

Journal

Journal title: Social Science Research Network

Year: 2022

ISSN: 1556-5068

DOI: https://doi.org/10.2139/ssrn.4226484